5 research outputs found

    A step towards Advancing Digital Phenotyping In Mental Healthcare

    Smartphones and wrist-wearable devices have infiltrated our lives in recent years. According to published statistics, nearly 84% of the world’s population owns a smartphone, and almost 10% own a wearable device today (2022). These devices continuously generate various data sources from multiple sensors and apps, creating our digital phenotypes. This opens new research opportunities, particularly in mental health care, which has previously relied almost exclusively on self-reports of mental health symptoms. Unobtrusive monitoring using patients’ devices may result in clinically valuable markers that can improve diagnostic processes, tailor treatment choices, provide continuous insights into their condition for actionable outcomes, such as early signs of relapse, and develop new intervention models. However, these data sources must be translated into meaningful, actionable features related to mental health to achieve their full potential. In the mental health field, there is a great need and much to be gained from defining a way to continuously assess the evolution of patients’ mental states, ideally in their everyday environment, to support the monitoring and treatments by health care providers. A smartphone-based approach may be valuable in gathering long-term objective data, aside from the usually used self-ratings, to predict clinical state changes and investigate causal inferences about state changes in patients (e.g., those with affective disorders). Being objective does not imply that passive data collection is also perfect. It has several challenges: some sensors generate vast volumes of data, and others cause significant battery drain. Furthermore, the analysis of raw passive data is complicated, and collecting certain types of data may interfere with the phenotype of interest. Nonetheless, machine learning is predisposed to address these matters and advance psychiatry’s era of personalised medicine. 
This work aimed to advance the research efforts on mobile and wearable sensors for mental health monitoring. We applied supervised and unsupervised machine learning methods to model and understand mental disease evolution based on the digital phenotype of patients and clinician assessments at the follow-up visits, which provide ground truths. We needed to cope with regularly and irregularly sampled, high-dimensional, and heterogeneous time series data susceptible to distortion and missingness. Hence, the developed methods had to be robust to these limitations and handle missing data properly. Throughout the various projects presented here, we used probabilistic latent variable models for data imputation and feature extraction, namely, mixture models (MM) and hidden Markov models (HMM). These unsupervised models can learn even in the presence of missing data by marginalising out the missing values given the observed ones. Once the generative models are trained on the data set with missing values, they can be used to generate samples for imputation. First, the most probable component/state has to be found for each sample. Then, sampling from the most probable distribution yields valid and robust parameter estimates and explicit imputed values for variables that can be analysed as outcomes or predictors. The imputation process can be repeated several times, creating multiple datasets, thereby accounting for the uncertainty in the imputed values and implicitly augmenting the data. Moreover, these models are robust to moderate deviations of the observed data from the assumed underlying distribution and provide accurate estimates even when missingness is high. 
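The two-step imputation described above (find the most probable component, then sample the missing values from it) can be sketched as follows. This is a minimal illustration and not the thesis code: it assumes an already-fitted Gaussian mixture with diagonal covariances, and the parameters `WEIGHTS`, `MEANS`, `STDS` and all function names are hypothetical.

```python
import math
import random

# Hypothetical, already-fitted diagonal Gaussian mixture (2 components, 3 features).
WEIGHTS = [0.6, 0.4]
MEANS = [[0.0, 1.0, 5.0], [4.0, -2.0, 9.0]]
STDS = [[1.0, 0.5, 2.0], [1.5, 1.0, 1.0]]


def log_gauss(x, mean, std):
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def most_probable_component(sample):
    """Index of the component with the highest posterior given the observed entries.

    Missing entries are marked with None. With diagonal covariances,
    marginalising them out amounts to skipping them in the likelihood.
    """
    scores = []
    for weight, means, stds in zip(WEIGHTS, MEANS, STDS):
        score = math.log(weight)
        for x, mean, std in zip(sample, means, stds):
            if x is not None:
                score += log_gauss(x, mean, std)
        scores.append(score)
    return max(range(len(scores)), key=scores.__getitem__)


def impute(sample, rng):
    """Fill each missing entry with a draw from the most probable component."""
    k = most_probable_component(sample)
    return [
        x if x is not None else rng.gauss(MEANS[k][j], STDS[k][j])
        for j, x in enumerate(sample)
    ]


def multiple_imputation(sample, n_draws=5, seed=0):
    """Repeat the imputation to obtain several completed copies of one sample."""
    rng = random.Random(seed)
    return [impute(sample, rng) for _ in range(n_draws)]


# One sample with its second feature missing.
draws = multiple_imputation([3.8, None, 8.5])
```

Each element of `draws` keeps the observed values and differs only in the imputed entry, so the spread across draws reflects the uncertainty of the imputation.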
Depending on the properties of the data at hand, we employed feature extraction methods combined with classical machine learning algorithms or deep learning-based techniques for temporal modelling to predict various mental health outcomes of psychiatric outpatients: emotional state, World Health Organisation Disability Assessment Schedule (WHODAS 2.0) functionality scores, and Generalised Anxiety Disorder-7 (GAD-7) scores. We mainly focused on one-size-fits-all models, as the labelled sample size per patient was limited; however, in the mood prediction case, it was possible to apply personalised models. Integrating machines and algorithms into the clinical workflow requires interpretability to increase acceptance. Therefore, we also analysed feature importance by computing Shapley additive explanations (SHAP) values. SHAP values summarise the most important features of a machine learning model by assigning each feature a weight that reflects how strongly, and in which direction (positive or negative), it contributes to the prediction of the target variable. The provided solutions are proofs of concept that require further clinical validation before they can be deployed in the clinical workflow. Still, the results are promising and lay some foundations for future research and collaboration among clinicians, patients, and computer scientists. They set out paths for advancing future research in technology-based mental healthcare.
Doctoral Programme in Multimedia and Communications, Universidad Carlos III de Madrid and Universidad Rey Juan Carlos. President: David Ramírez García. Secretary: Alfredo Nazábal Rentería. Member: María Luisa Barrigón Estéve
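To make the idea behind the Shapley-based feature-importance analysis in this abstract concrete, the sketch below computes exact Shapley values by enumerating feature coalitions. It is an illustration under stated assumptions, not the SHAP library and not the method used in the thesis: absent features are replaced by a baseline value, and the `toy_model`, its coefficients, and the feature names are made up.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions of predict(x) relative to predict(baseline).

    The cost grows exponentially with the number of features, so this is
    only practical for a handful of features.
    """
    n = len(x)

    def value(coalition):
        # Features outside the coalition are replaced by their baseline value.
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi


def toy_model(features):
    """Hypothetical linear score over three made-up behavioural features."""
    activity, sleep_hours, phone_hours = features
    return 0.5 - 0.02 * activity - 0.05 * sleep_hours + 0.1 * phone_hours


phi = shapley_values(toy_model, x=[10.0, 6.0, 5.0], baseline=[8.0, 7.0, 3.0])
```

For a linear model each attribution equals the coefficient times the feature's deviation from the baseline, and the attributions sum to the difference between the prediction and the baseline prediction. The sign of each value gives the direction of the feature's effect.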

    Predicting emotional states using behavioral markers derived from passively sensed data: Data-driven machine learning approach

    Background: Mental health disorders affect multiple aspects of patients’ lives, including mood, cognition, and behavior. eHealth and mobile health (mHealth) technologies enable rich sets of information to be collected noninvasively, representing a promising opportunity to construct behavioral markers of mental health. Combining such data with self-reported information about psychological symptoms may provide a more comprehensive and contextualized view of a patient’s mental state than questionnaire data alone. However, mobile sensed data are usually noisy and incomplete, with significant amounts of missing observations. Therefore, recognizing the clinical potential of mHealth tools depends critically on developing methods to cope with such data issues. Objective: This study aims to present a machine learning–based approach for emotional state prediction that uses passively collected data from mobile phones and wearable devices and self-reported emotions. The proposed methods must cope with high-dimensional and heterogeneous time-series data with a large percentage of missing observations. Methods: Passively sensed behavior and self-reported emotional state data from a cohort of 943 individuals (outpatients recruited from community clinics) were available for analysis. All patients had at least 30 days’ worth of naturally occurring behavior observations, including information about physical activity, geolocation, sleep, and smartphone app use. These regularly sampled but frequently missing and heterogeneous time series were analyzed with the following probabilistic latent variable models for data averaging and feature extraction: mixture model (MM) and hidden Markov model (HMM). The extracted features were then combined with a classifier to predict emotional state. A variety of classical machine learning methods and recurrent neural networks were compared. 
Finally, a personalized Bayesian model was proposed to improve performance by considering the individual differences in the data and applying a different classifier bias term for each patient. Results: Probabilistic generative models proved to be good preprocessing and feature extractor tools for data with large percentages of missing observations. Models that took into account the posterior probabilities of the MM and HMM latent states outperformed those that did not by more than 20%, suggesting that the underlying behavioral patterns identified were meaningful for individuals’ overall emotional state. The best performing generalized models achieved a 0.81 area under the curve of the receiver operating characteristic and 0.71 area under the precision-recall curve when predicting self-reported emotional valence from behavior in held-out test data. Moreover, the proposed personalized models demonstrated that accounting for individual differences through a simple hierarchical model can substantially improve emotional state prediction performance without relying on previous days’ data. Conclusions: These findings demonstrate the feasibility of designing machine learning models for predicting emotional states from mobile sensing data capable of dealing with heterogeneous data with large numbers of missing observations. Such models may represent valuable tools for clinicians to monitor patients’ mood states.
This project has received funding from the European Union's Horizon 2020 Research and Innovation Program under the Marie Sklodowska-Curie grant agreement number 813533. 
This work was partly supported by the Spanish government (Ministerio de Ciencia e Innovación) under grants TEC2017-92552-EXP and RTI2018-099655-B-100; the Comunidad de Madrid under grants IND2017/TIC-7618, IND2018/TIC-9649, IND2020/TIC-17372, and Y2018/TCS-4705; the BBVA Foundation under the Domain Alignment and Data Wrangling with Deep Generative Models (Deep-DARWiN) project; and the European Union (European Regional Development Fund and the European Research Council) through the European Union's Horizon 2020 Research and Innovation Program under grant 714161. The authors thank Enrique Baca-Garcia for providing demographic and clinical data and assisting in interpreting and summarizing the data
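The abstract above uses the posterior probabilities of HMM latent states as features for data with many missing observations. A minimal sketch of how such posteriors can be computed is given below, assuming a discrete-emission HMM with known parameters. The function name and the toy parameters are hypothetical, and the study's actual emission models are not specified here.

```python
def hmm_posteriors(obs, start, trans, emit):
    """Posterior state probabilities via the scaled forward-backward algorithm.

    obs   : list of symbol indices; None marks a missing observation
    start : initial state probabilities, length K
    trans : trans[i][j] = P(state j at t+1 | state i at t)
    emit  : emit[k][s] = P(symbol s | state k)
    """
    n_states, n_steps = len(start), len(obs)

    def likelihood(k, o):
        # A missing observation carries no information: likelihood 1 for every state.
        return 1.0 if o is None else emit[k][o]

    # Forward pass, normalised at each step for numerical stability.
    alpha = []
    for t in range(n_steps):
        if t == 0:
            a = [start[k] * likelihood(k, obs[0]) for k in range(n_states)]
        else:
            a = [
                likelihood(k, obs[t])
                * sum(alpha[t - 1][i] * trans[i][k] for i in range(n_states))
                for k in range(n_states)
            ]
        norm = sum(a)
        alpha.append([v / norm for v in a])

    # Backward pass, also normalised at each step.
    beta = [[1.0] * n_states for _ in range(n_steps)]
    for t in range(n_steps - 2, -1, -1):
        b = [
            sum(
                trans[k][j] * likelihood(j, obs[t + 1]) * beta[t + 1][j]
                for j in range(n_states)
            )
            for k in range(n_states)
        ]
        norm = sum(b)
        beta[t] = [v / norm for v in b]

    # Combine and renormalise to get P(state at t | all observations).
    posteriors = []
    for t in range(n_steps):
        g = [alpha[t][k] * beta[t][k] for k in range(n_states)]
        norm = sum(g)
        posteriors.append([v / norm for v in g])
    return posteriors


# Toy example: 2 hidden states, symbols 0 = low activity, 1 = high activity.
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.8, 0.2], [0.2, 0.8]]
posteriors = hmm_posteriors([1, 1, None, 0, 0], start, trans, emit)
```

The row for the missing day is still a valid distribution, inferred from neighbouring days through the transition matrix, so every day contributes a feature vector to a downstream classifier.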

    Shift in social media app usage during covid-19 lockdown and clinical anxiety symptoms: Machine learning-based ecological momentary assessment study

    Background: Anxiety symptoms during public health crises are associated with adverse psychiatric outcomes and impaired health decision-making. The interaction between real-time social media use patterns and clinical anxiety during infectious disease outbreaks is underexplored. Objective: We aimed to evaluate the usage pattern of 2 types of social media apps (communication and social networking) among patients in outpatient psychiatric treatment during the COVID-19 surge and lockdown in Madrid, Spain and their short-term anxiety symptoms (7-item General Anxiety Disorder scale) at clinical follow-up. Methods: The individual-level shifts in median social media usage behavior from February 1 through May 3, 2020 were summarized using repeated measures analysis of variance that accounted for the fixed effects of the lockdown (prelockdown versus postlockdown), group (clinical anxiety group versus nonclinical anxiety group), the interaction of lockdown and group, and random effects of users. A machine learning–based approach that combined a hidden Markov model and logistic regression was applied to predict clinical anxiety (n=44) and nonclinical anxiety (n=51), based on longitudinal time-series data that comprised communication and social networking app usage (in seconds) as well as anxiety-associated clinical survey variables, including the presence of an essential worker in the household, worries about life instability, changes in social interaction frequency during the lockdown, cohabitation status, and health status. Results: Individual-level analysis of daily social media usage showed that the increase in communication app usage from prelockdown to lockdown period was significantly smaller in the clinical anxiety group than that in the nonclinical anxiety group (F(1,72)=3.84, P=.05). 
The machine learning model achieved a mean accuracy of 62.30% (SD 16%) and area under the receiver operating curve 0.70 (SD 0.19) in 10-fold cross-validation in identifying the clinical anxiety group. Conclusions: Patients who reported severe anxiety symptoms were less active in communication apps after the mandated lockdown and more engaged in social networking apps in the overall period, which suggested that there was a different pattern of digital social behavior for adapting to the crisis. Predictive modeling using digital biomarkers (passive sensing of shifts in category-based social media app usage during the lockdown) can identify individuals at risk for psychiatric sequelae.
JR was supported by the American Psychiatric Association 2021 Junior Psychiatrist Research Colloquium (NIDA R-13 grant). ES received funding from the European Union Horizon 2020 research and innovation program (Marie Sklodowska-Curie grant 813533). AA is supported by the Spanish Ministerio de Ciencia, Innovación y Universidades (RTI2018-099655-B-I00), the Comunidad de Madrid (Y2018/TCS-4705 PRACTICO-CM), and the BBVA Foundation (Deep-DARWiN grant)
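The area under the receiver operating curve reported above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. A short sketch of that pairwise definition follows. The function name is hypothetical and the labels and scores are toy values, not study data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels : iterable of 0/1 class labels
    scores : iterable of classifier scores (higher means more likely positive)
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy scores: three of the four positive/negative pairs are ranked correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In a 10-fold cross-validation, this quantity would be computed once per held-out fold and then summarised by its mean and standard deviation across folds.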

    Automatic patient functionality assessment from multimodal data using deep learning techniques – Development and feasibility evaluation

    Wearable devices and mobile sensors enable the real-time collection of an abundant source of physiological and behavioural data unobtrusively. Unlike traditional in-person evaluation or ecological momentary assessment (EMA) questionnaire-based approaches, these data sources open many possibilities in remote patient monitoring. However, defining robust models is challenging due to the data's noisy and frequently missing observations. This work proposes an attention-based Long Short-Term Memory (LSTM) neural network-based pipeline for predicting mobility impairment based on WHODAS 2.0 evaluation from such digital biomarkers. Furthermore, we addressed the missing observation problem by utilising hidden Markov models and the possibility of including information from unlabelled samples via transfer learning. We validated our approach using two wearable/mobile sensor data sets collected in the wild and socio-demographic information about the patients. Our results showed that in the WHODAS 2.0 mobility impairment prediction task, the proposed pipeline outperformed a prior baseline while additionally providing interpretability with attention heatmaps. Moreover, using a much smaller cohort via task transfer learning, the same model could learn to predict generalised anxiety severity accurately based on GAD-7 scores.
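The attention heatmaps mentioned above come from weights that an attention layer assigns to each time step. The sketch below shows one generic form, dot-product attention pooling over a sequence of hidden states. It is an assumption for illustration only: the abstract does not specify the attention mechanism, and in a trained model the hidden states and the `query` vector would come from the network rather than being hand-written.

```python
import math


def attention_pool(hidden_states, query):
    """Dot-product attention over time steps.

    hidden_states : list of equal-length vectors, one per time step
    query         : scoring vector (learned in a real model, fixed here)
    Returns the attention weights (one per time step, summing to 1) and the
    weighted average of the hidden states.
    """
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    # Softmax with max-subtraction for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    context = [
        sum(w * state[d] for w, state in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, context


# Hypothetical hidden states for three days and a hand-picked query vector.
weights, context = attention_pool([[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]], [1.0, 0.0])
```

Plotting `weights` per patient and per day gives the kind of heatmap that shows which days the model relied on most.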

    Continuous Assessment of Function and Disability via Mobile Sensing: Real-World Data-Driven Feasibility Study

    Background: Functional limitations are associated with poor clinical outcomes, higher mortality, and disability rates, especially in older adults. Continuous assessment of patients’ functionality is important for clinical practice; however, traditional questionnaire-based assessment methods are very time-consuming and infrequently used. Mobile sensing offers a great range of sources that can assess function and disability daily. Objective: This work aims to prove the feasibility of an interpretable machine learning pipeline for predicting function and disability based on the World Health Organization Disability Assessment Schedule (WHODAS) 2.0 outcomes of clinical outpatients, using passively collected digital biomarkers. Methods: One-month-long behavioral time-series data consisting of physical and digital activity descriptor variables were summarized using statistical measures (minimum, maximum, mean, median, SD, and IQR), creating 64 features that were used for prediction. We then applied sequential feature selection to each WHODAS 2.0 domain (cognition, mobility, self-care, getting along, life activities, and participation) to find the most descriptive features for each domain. Finally, we predicted the WHODAS 2.0 functional domain scores with linear regression on the best feature subsets. We reported the mean absolute errors and the mean absolute percentage errors over 4 folds as goodness-of-fit statistics to evaluate the model and allow for between-domain performance comparison. Results: Our machine learning–based models for predicting patients’ WHODAS functionality scores per domain achieved an average (across the 6 domains) mean absolute percentage error of 19.5%, varying between 14.86% (self-care domain) and 27.21% (life activities domain). We found that 5-19 features were sufficient for each domain, with the most relevant being the distance traveled, time spent at home, time spent walking, exercise time, and vehicle time. 
Conclusions: Our findings show the feasibility of using machine learning–based methods to assess functional health solely from passively sensed mobile data. The feature selection step provides a set of interpretable features for each domain, ensuring better explainability of the models’ decisions, an important aspect in clinical practice.
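The summary statistics and error measures named in this abstract can be sketched with the Python standard library as follows. The function names and the example series are hypothetical. The quartile convention used for the IQR is the default of `statistics.quantiles` and may differ from the one used in the study.

```python
import statistics


def summarise(series):
    """Six summary statistics of one behavioural time series."""
    q1, _, q3 = statistics.quantiles(series, n=4)
    return {
        "min": min(series),
        "max": max(series),
        "mean": statistics.mean(series),
        "median": statistics.median(series),
        "sd": statistics.stdev(series),
        "iqr": q3 - q1,
    }


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error; undefined when a true value is 0."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical week of daily step counts, turned into six features.
daily_steps = [1200, 3400, 2800, 0, 5100, 4300, 3900]
features = summarise(daily_steps)

# Hypothetical true and predicted domain scores.
error_abs = mae([10.0, 20.0], [12.0, 18.0])
error_pct = mape([10.0, 20.0], [12.0, 18.0])
```

Because MAPE is expressed relative to the true score, it allows errors to be compared across domains whose scores have different ranges, which matches the between-domain comparison described in the abstract.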